
Microbial Genomics

Microbiology Society

Preprints posted in the last 7 days, ranked by how well they match Microbial Genomics's content profile, based on 204 papers previously published here. The average preprint has a 0.11% match score for this journal, so anything above that is already an above-average fit.

1
Integrating patient movement and pathogen genomics to support hospital infection prevention with PathoPath: a method development study

Sajib, M. S.; Tanmoy, A. M.; Kanon, N.; Jui, A. B.; Islam, M. S.; Dola, N. Z.; Hossain, M. M.; Mobarak, R.; Shahidullah, M.; Hoque, M.; Ahmed, A. N. U.; Holmes, A. H.; Saha, S. K.; Saha, S.; Wan, Y.; Hooda, Y.

2026-06-05 infectious diseases 10.64898/2026.06.03.26354630 medRxiv
Top 0.2%
10.1%

Background Healthcare-associated infections pose a major burden to neonatal health worldwide and remain difficult to track in low-resource hospitals because patient movement data and pathogen genomic data are rarely integrated into actionable transmission models. Existing approaches are often restricted to specific settings, highly structured electronic health records (EHRs), or analyses focused on either patient movements or pathogen characteristics alone. To address this gap, we developed PathoPath, an open-source integrative modelling platform, and evaluated its utility in a high burden paediatric hospital in Dhaka, Bangladesh. Methods PathoPath is an open-source R package that combines electronic health records with whole genome sequencing data to generate contact networks from direct and indirect contacts using minimal structured inputs. We retrospectively applied PathoPath to 373 cases of Klebsiella pneumoniae species complex (KpSC) infection identified in 2021 at the largest paediatric referral hospital in Dhaka, Bangladesh. Ward level patient movement trajectories were used to reconstruct contact networks, and genomic data from isolates from children <60 days were integrated to identify probable dissemination of bacterial clones and antimicrobial resistance plasmids. Findings PathoPath identified 750 direct contacts among 317 patients, forming 25 connected components, with the largest including 93 patients. KpSC infections were identified across 21 of 37 wards, with the neonatal intensive care unit accounting for 77.9% of all cases. Integration of genomic and network data distinguished sustained clustering of ST147 from multiple probable inter-clonal dissemination events involving IncFII plasmids carrying blaNDM-5 and/or blaOXA-181 within ST16. Four dominant sequence types accounted for 65.6% of sequenced isolates, and carbapenemase genes were detected in 95.8%. 
Interpretation PathoPath reconstructs hospital-wide contact networks and integrates them with pathogen genomics to map probable dissemination of pathogens and antimicrobial resistance using minimal structured clinical data. It could support more targeted infection prevention and control in hospitals where granular digital records are not available.
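The contact-network step described in this abstract can be illustrated with a minimal Python sketch. It is not PathoPath itself (which the abstract describes as an R package) and does not reproduce its API: the record layout, the same-ward time-overlap rule for a "direct contact", and all patient and ward names below are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical ward-stay records: (patient_id, ward, admission, discharge).
stays = [
    ("P1", "NICU", date(2021, 1, 1), date(2021, 1, 10)),
    ("P2", "NICU", date(2021, 1, 5), date(2021, 1, 12)),
    ("P3", "NICU", date(2021, 1, 11), date(2021, 1, 15)),
    ("P4", "Ward7", date(2021, 1, 3), date(2021, 1, 6)),
    ("P5", "Ward7", date(2021, 2, 1), date(2021, 2, 4)),
]


def direct_contacts(records):
    """Return patient pairs whose stays on the same ward overlap in time.

    Assumed definition of a direct contact; the real tool may differ.
    """
    edges = set()
    for i, (p, ward, start, end) in enumerate(records):
        for q, ward2, start2, end2 in records[i + 1:]:
            same_ward = ward == ward2
            overlap = start <= end2 and start2 <= end
            if p != q and same_ward and overlap:
                edges.add(tuple(sorted((p, q))))
    return edges


def connected_components(nodes, edges):
    """Group patients into connected components with a depth-first search."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, components = set(), []
    for node in sorted(nodes):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(component)
    return components


patients = {record[0] for record in stays}
edges = direct_contacts(stays)
components = connected_components(patients, edges)
print(sorted(edges))
print([sorted(c) for c in components])
```

In this toy data, P1 and P3 never share the ward at the same time but fall into one component through P2, which is the kind of chained link a component-level view exposes.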

2
Compositional microbiome-based signatures associate with general health status: findings from a large population-based cohort study

Pujolassos, M.; Kurilshikov, A.; Weersma, R. K.; Yang-Fu, J.; Zhernakova, A.; Calle, M. L.

2026-06-04 epidemiology 10.64898/2026.06.03.26354796 medRxiv
Top 1%
1.6%

While microbiome is increasingly recognized as crucial for human health, translating this knowledge into effective healthcare and preventive strategies remains challenging. Many studies focus on identifying changes in microbiome composition associated with disease and evaluating the potential of such disease-associated microbial profiles as biomarkers for disease diagnosis. Under the hypothesis that microbiome dysbiosis may reflect physiological alterations present long before disease onset, in this work, we analyse the potential of disease-specific microbial signatures not as a diagnostic tool when the disease is already present, but as a means of health assessment in the general population. Moreover, instead of trying to define a single health measure, we believe it is necessary to consider several ways in which the microbiome departs from health, according to different disease-related physiological changes. To evaluate our assumptions, we designed a two-stage study: the identification of disease-specific microbial signatures (discovery stage) and, subsequently, the study of their distribution in the general population to assess associations with general health (external validation stage). Specifically, in the discovery phase we characterized 16 disease-specific bacterial signatures from large public microbiome data using a compositional data analysis methodology. In the second phase, we quantified these microbial signatures in the Lifelines-DMP cohort, a large population-based cohort, and evaluated their association with self-reported health status. Results indicate that most disease-specific microbial signatures associate with health status, supporting our assumption that microbial composition can capture physiological alterations before disease onset, and highlighting the importance of considering multiple ways in which microbiome departs from a healthy state. 
These findings reaffirm the potential of microbial information as an additional tool in preventive medicine.

3
A New Mixed Frequency Regression Model For Environmental Epidemiology

Shukla, N.; Bartington, S. E.; Hansell, A. L.; Lucas, T. C.

2026-06-04 epidemiology 10.64898/2026.06.03.26354801 medRxiv
Top 2%
0.8%

Background: In the absence of high-resolution response data, exposure-response modelling often relies on aggregated low-frequency exposure data, leading to loss of high-resolution information. Mixed Data Sampling (MIDAS) from econometrics offers an alternative but is limited due to its inability to make high-resolution predictions, inflexible likelihoods and penalised nonlinear functions, and limited visualization options. We propose a mixed-frequency Distributed Lag Non-linear Model (mf-DLNM) which can eliminate the need to aggregate exposure data in environmental epidemiology and provide high resolution predictions for time series studies. Methods: We evaluated the inference and predictive performance of the mf-DLNM. To evaluate its ability to estimate exposure-response relationships, we applied mf-DLNM and same-frequency (sf)-DLNM using data from the West Midlands, UK. Additionally, we compared the predictive performance of mf-DLNM with sf-DLNM and MIDAS across nine regions of England. As MIDAS cannot predict at the resolution of the predictor (daily), we compared the predictive performance of mf-DLNM and MIDAS at weekly resolution. To test the model's ability to predict high temporal resolution risk (daily), we compared sf-DLNM (with access to daily mortality counts) with mf-DLNM (with access only to weekly mortality counts). Results: In the West Midlands example, mf-DLNM performed comparably to sf-DLNM in estimating daily risk of temperature on respiratory mortality. Furthermore, mf-DLNM and MIDAS exhibited similar performance for weekly predictions. For high-resolution predictions, mf-DLNM and sf-DLNM showed nearly similar performance, despite mf-DLNM having access only to low-resolution response data. 
Conclusion: This mixed-frequency approach in environmental epidemiology overcomes the limitations of predicting health risks using aggregated exposure data and provides estimates of high-resolution outcomes in the absence of high-frequency health outcome datasets.

4
KESOZI Digital Twin: Physics-Informed Neural Network for Independent Estimation and Prediction of Childhood Diarrheal Disease Burden in Kenya, Somaliland, and Zimbabwe

KESOZI Digital Twin; Agumba, J. O.; Namusonge, L.; Ogendo, J.; Hassan, M. A.; Pembere, A.; Takavarasha, M.

2026-06-04 epidemiology 10.64898/2026.06.03.26354823 medRxiv
Top 3%
0.5%

Childhood diarrheal disease remains a leading cause of morbidity and mortality among children under five years in sub-Saharan Africa, particularly in settings affected by inadequate sanitation, climate variability, malnutrition, and limited healthcare access. Conventional forecasting approaches are often constrained by sparse surveillance data, weak spatial representation, and limited incorporation of mechanistic disease dynamics. This study presents a Physics-Informed Multimodal Artificial Intelligence Digital Twin framework that integrates Physics-Informed Neural Networks, Graph Neural Networks, diffusion-reaction epidemiological modeling, multimodal fusion learning, and Digital Twin simulation to estimate and predict childhood diarrheal disease burden in Kenya, Somaliland, and Zimbabwe. Using public epidemiological, environmental, climate, sanitation, and synthetic proof-of-concept datasets, the framework modeled temporal disease dynamics, spatial transmission, pathogen-attributed burden, and outbreak trajectories while enforcing epidemiological consistency through physics-informed optimization. Results demonstrated robust forecasting performance, enhanced spatial transmission modeling, uncertainty-aware predictions, and realistic outbreak simulations across the three countries. Rotavirus, Shigella, and Cryptosporidium were identified as major contributors to modeled mortality burden, while unsafe water exposure, poor sanitation, malnutrition, and climate-sensitive transmission substantially increased disease risk. Compared with a Bayesian baseline model, the multimodal framework achieved superior nonlinear risk characterization, geospatial learning, and temporal prediction. These findings highlight the potential of scientific machine learning and digital twin systems for infectious disease surveillance, outbreak forecasting, climate-health analytics, and evidence-based public health decision-making in low-resource African settings. 
Keywords: Physics-Informed Neural Networks, Graph Neural Networks, Digital Twin, Childhood Diarrheal Disease, Epidemiology, Kenya, Somaliland, Zimbabwe, Scientific Machine Learning, Spatial Epidemiology, Multimodal Fusion

5
Spatial and temporal associations between animal ownership and malaria prevalence in Africa using cross-sectional national Demographic and Health Surveys

Topazian, H. M.; Morgan, C. E.; Goel, V.

2026-06-08 epidemiology 10.64898/2026.06.05.26355017 medRxiv
Top 4%
0.3%

Use of zooprophylaxis as a malaria control strategy has been recommended historically, but a complex relationship exists between animal ownership and malaria infection, with mixed associations described in the literature. We sought to characterize this relationship spatially and temporally in malaria-endemic regions of Africa. We used data from 392,843 individuals from 66 Demographic and Health surveys from countries within Africa to investigate the association between household animal ownership and Plasmodium infection. We used Bayesian models with Integrated Nested Laplace Approximation to incorporate spatially varying coefficient processes, allowing the association of interest to vary over space, time, and within strata of vector species occurrence, land cover, and number of animals owned by households. Spatially varying intercept models showed that ownership of cattle, chickens/poultry, goats, horses/donkeys/mules, pigs, and sheep was broadly associated with malaria infection, with odds ratios ranging from 1.55 to 1.67. However, spatially varying slope models revealed considerable heterogeneity, with odds ratio estimates for all animal types demonstrating both protective and harmful effects varying from 0.33 to 3.33 both subnationally and across time. We found no evidence that modification by vector species, number of animals owned, and land cover fully explained the variation in estimates. Unobserved localized cultural, behavioral, or ecological factors likely modify the association between animal ownership and malaria prevalence. Further exploring the nature of this relationship over space and time will be important to understanding how context-specific One Health dynamics between humans, animals and the environment affect malaria prevention and control efforts.

6
Physical activity, fatty acids, and MASLD risk: Behavioural and metabolic factors jointly shaping liver health in populations

Chen, F.; You, R.; Liu, Y.; Yin, Y.; Liu, A.; Deng, L.; Xie, B.; Fan, J.; Wang, W.

2026-06-08 epidemiology 10.64898/2026.06.05.26354982 medRxiv
Top 5%
0.2%

Background and Aims: MASLD has become the most prevalent chronic liver disease globally. Although MVPA and plasma fatty acids have been individually studied in relation to metabolic health, their independent and combined associations with MASLD incidence remain unclear. We aimed to investigate these associations. Methods: This study included 51,717 UK Biobank participants free of liver disease at baseline, with MVPA measured using wrist-worn accelerometers and plasma fatty acids quantified via NMR. Multivariable-adjusted Cox models and restricted cubic splines were used. Results: Over a median follow-up of 7.8 years, 472 incident cases were identified. In fully adjusted models, meeting recommended MVPA levels together with higher n-6 PUFA concentrations was associated with a 71% lower risk (HR 0.29, 95% CI 0.18-0.45). The MVPA-MASLD association was nonlinear, with risk reduction plateauing at approximately 189 minutes per week. Higher n-6 PUFA was associated with reduced risk, whereas n-3 PUFA showed no significant association. Conclusions: These findings suggest that behavioral and metabolic factors may jointly influence MASLD risk. Further studies in diverse populations are needed to confirm these associations.

7
Within-household transmission risk of pulmonary tuberculosis in the era of universal antiretroviral therapy

Khan, P. Y.; Govender, I.; McCreesh, N.; Sithole, M.; Mkwanzai, E.; Sweeney, S.; Ording-Jespersen, G.; Wong, E. B.; Hanekom, W.; Houben, R. M. G. J.; White, R. G. M. G. J.; Smit, T.; Smith, M. J.; Fielding, K.; Grant, A. D.

2026-06-09 epidemiology 10.64898/2026.06.01.26354571 medRxiv
Top 5%
0.2%

Background Tuberculosis remains the leading infectious cause of death worldwide. In the WHO African region, declining incidence has coincided with antiretroviral therapy (ART) scale-up, though whether this reflects reduced progression to disease or reduced transmission is unclear. We evaluated how ART and symptom status influence within-household Mycobacterium tuberculosis complex (MTBC) transmission risk. Methods We conducted a case-contact household study in rural South Africa, enrolling index adults with bacteriologically-confirmed pulmonary tuberculosis. MTBC immunoreactivity was measured in all child household contacts (aged 2-14 years) as a proxy measure of within-household transmission. We assessed the influence of index person ART status and symptom status, and explored effect-measure modification of the association between index person HIV status and transmission risk by sex. Results Among 755 child contacts of 296 index persons, effective ART was not associated with within-household MTBC transmission risk (risk ratio [RR], 1.07; 95% CI, 0.66-1.74). Among PLHIV engaged in ART care, WHO TB four-symptom screen (WHO4SS) status was not associated with transmission risk (RR, 0.80; 95% CI, 0.43-1.47), although absence of reported cough reduced risk (RR, 0.61; 95% CI, 0.38-0.96). A pronounced interaction between sex and HIV status was observed: HIV-negative women had the highest within-household MTBC transmission risk (30.5% vs. 14.3% in women with HIV) whereas risks were similar between HIV-positive and HIV-negative men. Conclusions We found no evidence that effective ART or WHO4SS status influenced within-household MTBC transmission risk, though confidence intervals were wide. Absence of reported cough was associated with lower risk, and transmission risk was highest among child contacts of HIV-negative women. 
These findings suggest reported cough is a useful marker of transmission risk and that routine tuberculosis screening within ART care may reduce transmission from PLHIV; intensified efforts are nonetheless needed to achieve earlier tuberculosis detection in HIV-negative individuals.

8
Revisiting Plasmodium vivax molecular correction

Taylor, A. R.; Foo, Y. S.; White, M. T.

2026-06-04 infectious diseases 10.64898/2026.06.02.26354709 medRxiv
Top 6%
0.1%

Background: Reliable inference of Plasmodium vivax recurrence states - relapse, recrudescence and reinfection (the "3Rs") - improves estimates of antimalarial efficacy. The R package Pv3Rs features a Bayesian model designed for P. vivax molecular correction, i.e., using parasite genetic data to infer recurrence states. The model is an extension of a prototype built to analyse microsatellite data from the Vivax History (VHX) and Best Primaquine Dose (BPD) trials. Methods: We re-analysed data from 212 VHX and BPD trial participants (493 recurrences) using Pv3Rs, comparing results with those from the prototype and with genetic relatedness estimated using Dcifer, a tool for estimating relatedness based on identity-by-descent. Posterior recurrence state probabilities were computed using both uniform and time-to-event priors: artificial but equal prior probabilities facilitate posterior interpretation, while time-to-event priors leverage all available information and enable re-computation of failure rates. Relatedness estimates were used to identify and correct instances of model misspecification. Results: The Pv3Rs model generated posterior probabilities for all recurrences and was able to jointly model data on all episodes per participant for 89% of participants, compared with 73% using the prototype. Recurrence state probabilities were broadly consistent across methods, though the Pv3Rs model elevated reinfection probabilities slightly. Relatedness estimates exposed various outliers consistent with half-sibling parasites and/or genotyping errors. Outlier correction impacted some per-participant failure probabilities, but reinfection-adjusted radical-cure failure rates of high-dose primaquine remained near 3%, in line with previous findings. Conclusion: Re-analysis of VHX and BPD P. vivax genetic data restates earlier reinfection-adjusted efficacy estimates.
It demonstrates the increased computational capability and misspecification sensitivity of Pv3Rs, highlighting a need for careful analyses. Using relatedness-based diagnostics alongside model-based inference, we were able to harness the advantages of model-based inference and provide a framework for future P. vivax molecular correction.

9
Insights from Wastewater Surveillance of SARS-CoV-2 in Skilled Nursing Facilities: Comparing Virus Concentration Methods for Wastewater and Correlating Wastewater Virus Concentrations with Clinical Infections, Georgia, USA, 2022

Whitehill, F.; Lyons, A. K.; Abera, B.; Adler, C.; Burgos-Garay, M.; Campbell, M.; Santiago, A. J.; Ganim, C.; Moore, J.; Cahela, Y.; Lenz, S.; Gable, P.; Medrzycki, M.; Walters, M. S.; Keaton, A.; Cook, P. W.; Li, Y.; Tao, Y.; Zhang, J.; Malapati, L.; Retchless, A. C.; Tong, S.; Williams, M.; Donlan, R.; Coulliette-Salmond, A.

2026-06-04 epidemiology 10.64898/2026.06.01.26354622 medRxiv
Top 7%
0.1%

To understand the utility of healthcare facility-level wastewater surveillance (WWS) for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), it is important to correlate wastewater SARS-CoV-2 RNA detection with the number of clinical infections. WWS for SARS-CoV-2 was performed at three skilled nursing facilities (SNFs) over 25 weeks. Electronegative membrane filtration (enMF) and Nanotrap® Magnetic Virus Particles (NP) virus concentration methods were compared. Extracts were tested by droplet digital polymerase chain reaction. Spearman's correlations (ρ) between wastewater virus RNA concentrations and infection counts were calculated. From split wastewater samples, enMF recovered higher SARS-CoV-2 RNA concentrations than NP. Combining data from all facilities, the median concentrations were 53.0 versus 38.6 gc/100 mL for enMF and NP, respectively (p=0.001). Using enMF, correlations were moderate to strong at SNF A (ρ ranged 0.67 to 0.86, all p-values <0.001). Weak to moderate correlations can be explained by the sampled manhole not representing the entire facility (SNF B, ρ ranged 0.47 to 0.72, p-values ranged <0.001 to 0.12) and longitudinal data gaps from summer heat and equipment maintenance (SNF C, ρ ranged 0.14 to 0.59, p-values ranged 0.52 to <0.01). WWS can be a valuable tool for tracking dynamics of SARS-CoV-2 infections in healthcare facilities.

10
Development of a Human Papillomavirus genotype-informed risk-stratification model to improve Cervical Cancer screening in resource-limited settings: a cross-sectional study

Kambou Kountchou, K. D. K. K.; Tommo Tchouaket, M. C.; Moko Fotso, L. G.; Fokou Bomgning, B. N.; Fippo Fitime, L.; Talom Teumadjou, A.; Routoube, M.; Efakika Gabisa, J.; Ngoufack Jagni Semengue, E.; Nka, A. D.; Kae, A. C.; Dobgima Pisoh, W.; Deutou, L.; Takou, D.; Fainguem, N.; Sosso, S. M.; Kamgaing Simo, R.; Yagai, B.; Tabola Fossa, L.; Perno, C.-F.; Colizzi, V.; Enow-Orock, G.; Fokam, J.; Terrinoni, A.; Kuiate, J.-R.

2026-06-10 pathology 10.64898/2026.06.06.26355059 medRxiv
Top 7%
0.1%

Background: In resource-limited settings, a critical bottleneck in cervical cancer prevention is the lack of practical strategies to triage high-risk human papillomavirus (HR-HPV)-positive women. Therefore, this study aimed to develop and internally validate a genotype-specific risk stratification model. Methods: A cross-sectional study enrolled 555 women in Cameroon. Data collection integrated cervical cytology and HPV genotyping using Abbott m2000rt and Sacace multiplex systems. An iterative modeling approach with bootstrap validation was used to develop the model and address model instability. HR-HPV genotypes were transformed into a hierarchical risk variable due to sparsity and integrated with significant predictors. The final model was translated into a scoring system, and the risk gradients and performances were evaluated at two thresholds. Data were analyzed using SPSS 27.0. Results: The mean age was 44.8 years, and the prevalence of HR-HPV was 26.5% (147/555). The final model, incorporating HPV categories, age, and tobacco, demonstrated moderate discriminative ability (AUC=0.702, 0.642-0.762) with good calibration (Hosmer-Lemeshow χ²=4.05, p=0.399). The scoring system assigned women to risk groups based on their total scores, which produced a clear monotonic risk gradient; the observed probability of high-grade lesions/cancer ranged from 15% (score 0) to >65% (score ≥4). At a conservative threshold (≥4 points), 4.7% (26/555) of women were classified as high-risk, concentrating 46% (6/13) of cancers (positive predictive value [PPV]=58%), while a sensitive threshold (≥3 points) classified 16.8% (93/555) as high-risk, concentrating 77% (10/13) of cancers (PPV=38%). Both thresholds maintained a high negative predictive value (>95%). Conclusion: This bootstrap-validated risk-stratification tool is a proof-of-concept in resource-limited settings that assigns HR-HPV-positive women to distinct management pathways using three variables.
After refining through a longitudinal study and external validation, this scoring system can improve the efficiency of cervical cancer screening programs in low-resource settings.
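As a quick check of the threshold arithmetic quoted in this abstract, the short Python sketch below recomputes the percentages from the reported counts. Only figures stated in the abstract are used; the variable names are illustrative. The PPVs (58% and 38%) are not recomputed, because the number of high-grade lesions within each flagged group is not given.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


TOTAL_WOMEN = 555    # enrolled women (from the abstract)
TOTAL_CANCERS = 13   # cancers in the cohort (from the abstract)
HR_HPV_POSITIVE = 147

# Counts reported at each score threshold.
thresholds = {
    ">=4": {"flagged": 26, "cancers": 6},
    ">=3": {"flagged": 93, "cancers": 10},
}

summary = {
    name: {
        "flagged_pct": pct(counts["flagged"], TOTAL_WOMEN),
        "cancers_captured_pct": pct(counts["cancers"], TOTAL_CANCERS),
    }
    for name, counts in thresholds.items()
}

hr_hpv_prevalence = pct(HR_HPV_POSITIVE, TOTAL_WOMEN)

print("HR-HPV prevalence:", hr_hpv_prevalence)
for name, row in summary.items():
    print(name, row)
```

The recomputed values agree with the abstract after rounding to whole or one-decimal percentages.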

11
Temporal and climatic drivers of uncomplicated malaria in Ghana: A Region Generalised Additive Model analysis.

Akurugu, E.; Awine, T.; Seidu, B.; Peprah, N. Y.; Mohammed, W.; Boateng, P.; Abiwu, P. H. A. K.; Silal, S. P.

2026-06-09 infectious diseases 10.64898/2026.06.06.26355054 medRxiv
Top 8%
0.1%

Background Malaria remains a major public health challenge in Ghana, despite recent reductions in cases due to various interventions. The endemicity of the disease varies across regions, influenced by diverse seasonal and temporal factors that support mosquito proliferation and malaria cases. This study used Generalised Additive Models to explore the impact of weather conditions on malaria cases in Ghana. Methods Generalised Additive Models were used to examine the nonlinear effects of weather conditions on malaria cases. Monthly aggregated malaria cases from the District Health Information Management System II and average monthly rainfall and temperature data from the Ghana Meteorological Agency were analysed, covering 2012 to 2023. Regional Generalised Additive Models incorporating weather variables were developed, fitted, and validated against observed data using model diagnostics to identify the most suitable model for each region. Results The analysis revealed complex temporal patterns in malaria cases across Ghana, influenced by seasonal and long-term trends. Regions constituting the Coastal and Transitional Forest zones exhibited bimodal peak malaria seasons, while the Guinea Savannah showed a unimodal peak. Significant interactions between rainfall and temperature were identified, particularly in the Eastern region, where higher rainfall combined with temperatures around 27-28 °C was associated with higher malaria cases, reflecting the complex and region-specific nature of meteorological influences. Conclusions The findings point to the dynamic and heterogeneous nature of malaria caseloads in Ghana, emphasising the need for region-specific control strategies tailored to local climatic conditions. A key recommendation is the systematic integration of meteorological data into the National Malaria Data Repository to enable continuous monitoring of climatic influences and support timely, evidence-based intervention decisions.
Future research should incorporate socio-economic factors, intervention coverage data, vector surveillance, and demographic characteristics into mathematical modelling frameworks for a more comprehensive understanding of malaria cases in Ghana.

12
Limitations of cross-border containment strategies for Bundibugyo ebolavirus

Middleton, C.; Larremore, D.

2026-06-08 epidemiology 10.64898/2026.06.04.26354820 medRxiv
Top 8%
0.1%

An ongoing outbreak of Bundibugyo virus disease (BVD) in the Democratic Republic of the Congo was deemed a public health emergency of international concern in May 2026. To prevent cross-border importation, many countries, including the United States, Canada, India, Thailand, and Kenya, have already proposed containment strategies, and others are likely to follow suit. How well (or poorly) are screening and quarantine containment measures likely to work? We leverage established epidemiological theory and develop a mathematical model of traveler screening and post-arrival quarantine for BVD to answer this question. We find that traveler screening via symptom screening or molecular testing will miss the majority of infected travelers, and should be complemented by post-arrival quarantine and monitoring of sufficient duration to detect those with long incubation periods. Our findings underscore the limitations of border screening and the importance of complementary measures like post-arrival quarantine to prevent local importation of BVD.

13
Multi-region sampling of the human small intestine using an ingestible device

Fu, B.; DeSchepper, L. B.; Sun, J.; McKeithen-Mead, S. A.; Kapili, B.; Ochoa-Andersen, P.; Spencer, S. P.; Fardeen, T.; Ricardo, M.; El Kamari, V.; Sinha, S.; Relman, D. A.; Grembi, J. A.; Shalon, D.; Estrela, S.; Huang, K. C.

2026-06-10 gastroenterology 10.64898/2026.06.09.26353912 medRxiv
Top 9%
0.0%

The human small intestine (SI) plays a central role in nutrient processing, host-microbe interactions, and immune regulation, yet remains poorly characterized due to the lack of minimally disruptive sampling methods. Here, we present a protocol for deploying, recovering, and analyzing samples collected using an ingestible device that enables multi-region, lumen-targeted SI sampling during normal digestion. The device incorporates a ~30-cm collapsible tube wound into pH- or time-responsive layers that sequentially unfurl in situ, typically capturing three spatially ordered samples with high yield and reliable retrieval. This protocol outlines study design, participant handling, device recovery, contamination control, and standardized workflows for analyses, including cell quantification, culturomics, sequencing, and metabolomics. We further describe benchmarking approaches for evaluating spatial resolution and strategies for assay prioritization when sample volume is limiting. By reducing participant burden and facilitating integration with stool, saliva, and clinical metadata, this approach enables longitudinal and large-cohort studies linking SI microbial ecology and host physiology to human health.

14
Assessing the impact of absence of coordination in malaria intervention strategies: a modelling study

Iggidr, Y.; Ruktanonchai, N. W.; Benhana, B.; Turbe, V.; Bauzile, B.; Ward, A.; Cohen, J.; Pothin, E.; Champagne, C.

2026-06-05 epidemiology 10.64898/2026.06.03.26354857 medRxiv
Top 10%
0.0%

Malaria control programs are increasingly tailored at subnational scales; however, neighboring areas remain connected through human mobility, allowing parasite importation that may undermine independently timed interventions. Although the spatial targeting of control has been the focus of extensive research, the epidemiological consequences of temporal misalignment in intervention deployment across interconnected regions remain to be elucidated. We investigate how asynchronous timing of malaria interventions affects transmission dynamics using a two-patch susceptible-infected-susceptible metapopulation model. We compare synchronous and asynchronous intervention schedules and quantify their impact using measures of excess cumulative incidence attributable to asynchrony. The measure that will be used for this purpose is referred to as Asynchrony Induced Growth (AIG). Across a range of 10,000 parameter combinations, asynchronous implementation has been observed to result in a heightened incidence compared to synchronized deployment, though the impact is typically negligible in most endemic settings. Sensitivity analyses indicate that the impact is most significant when interventions are highly effective, infectious duration is brief, and transmission intensity approaches the elimination threshold. In such circumstances, asynchrony has the potential to substantially inflate case numbers, delay transmission interruption, or even prevent elimination entirely. In illustrative scenarios that reflect realistic settings, synchronizing interventions has been shown to avert large numbers of infections and shorten elimination timelines by years to decades. These findings demonstrate that, beyond spatial targeting, temporal coordination of interventions across connected areas can meaningfully enhance malaria control and elimination. 
Coordinated timing may be particularly valuable for cross-border or near-elimination programs and should be considered in operational planning and resource allocation.

15
A Decade of the Center for Disease Control and Prevention's FluSight Influenza Forecasting

Hines, A. G.; Mathis, S. M.; Johansson, M. A.; Biggerstaff, M.; Reed, C.; Borchering, R.

2026-06-08 epidemiology 10.64898/2026.06.05.26354941 medRxiv
Top 11%
0.0%

Since the U.S. 2013/14 influenza season, the CDC's FluSight Challenge has provided a platform for evaluating influenza forecasting models and fostering collaboration across institutions. The Challenge aims to improve the science and enhance the utility of infectious disease forecasts for public health decision making. We analyzed ten years of submitted forecasts (2014/15-2019/20 (influenza-like illness seasons) and 2021/22-2024/25 (hospital admissions seasons)) across a range of model types, including statistical, mechanistic, machine learning, and hybrid models. Influenza-like illness (ILI) forecasts were evaluated using the exponentiated logarithmic score (skill metric) while hospital admissions forecasts were evaluated using the log transformed relative Weighted Interval Score. Corresponding potential performance differences were assessed using Wilcoxon rank-sum tests, and associations with team participation history were evaluated using Spearman's rank correlation. Model performance varied by season, and no single model type consistently outperformed others. In ILI seasons, statistical models generally performed better than mechanistic and machine learning models, though consistent differences were not observed in more recent hospital admissions seasons. Ensemble forecasts showed better overall performance across seasons, and the CDC's FluSight ensemble ranked among the top-performing forecasts every year. We also found a positive correlation between forecast accuracy and the number of years a team participated in the Challenge, with statistically significant associations in four seasons. These findings highlight the benefits of ensemble approaches and sustained engagement in improving forecasting performance, while also underscoring the continued value of forecast evaluation before and following the COVID-19 pandemic. 
Insights from the FluSight Challenge can guide future infectious disease forecasting efforts and support more effective public health preparedness.

16
Comparison of the Mini Parasep SF, ParaPak SpinCon, and Paradevice fecal filtration and concentration devices for microscopic and AI-assisted detection of intestinal parasites

Morris, H.; Pritt, B. S.

2026-06-04 infectious diseases 10.64898/2026.06.02.26354769 medRxiv
Top 11%
0.0%

Effective filtration and concentration of stool specimens is an essential pre-analytical step for reducing fecal debris and improving organism recovery using microscopy-based ova and parasite (O&P) examination. This study evaluated three commercially available fecal sedimentation-based filtration/concentration systems, ParaPak SpinCon (Meridian Bioscience), Mini Parasep SF (Apacor), and the newly available Paradevice (Reingenuity), for qualitative parasite detection and workflow logistics using conventional and artificial intelligence (AI)-assisted microscopy. Forty clinical stool specimens (20 parasite-positive and 20 parasite-negative) were processed with the three devices, and the resultant 120 wet mount and 120 trichrome-stained smear preparations were examined using conventional microscopy. Trichrome-stained slides were also scanned at 40x magnification using a Hamamatsu NanoZoomer S360 flatbed digital slide scanner, and images were analyzed using the Techcyte Fusion Human Fecal Trichrome AI algorithm. Positive and indeterminate digital findings were confirmed by conventional glass slide microscopy. Slides and digital images were reviewed in a blinded manner. Concordance was assessed among the 360 initial evaluations (microscopy and AI-assisted), and discrepant parasitology results were resolved through re-review and specimen reprocessing as needed. Final qualitative agreement across slide/image evaluations using all three concentration systems was 100%. Minor discrepancies in protozoan and white/red blood cell detection/identification were noted in 5 and 7 cases, respectively, and likely reflected sampling and observer variability. While the three concentration systems produced equivalent qualitative results, the Paradevice and Mini Parasep SF offered the most streamlined workflows. These findings support the Paradevice and Mini Parasep SF as efficient, analytically equivalent systems that are compatible with traditional and AI-assisted O&P workflows.

17
Spatiotemporal Dynamics of Human Metapneumovirus and Potential Impact of Respiratory Syncytial Virus Interventions in the United States

Li, K.; Perniciaro, S.; Kwon, J.; Grubaugh, N. D.; Weinberger, D. M.; Pitzer, V. E.

2026-06-04 infectious diseases 10.64898/2026.06.01.26354616 medRxiv
Top 11%
0.0%

Human metapneumovirus (HMPV) causes acute lower respiratory infections, primarily affecting young children and older adults, with seasonal outbreaks peaking annually in March or April in the United States and other temperate regions in the Northern hemisphere. However, the factors driving HMPV seasonality in the United States remain poorly understood. We analyzed laboratory-confirmed HMPV cases and age-specific emergency department visits across 10 US regions, fitting an age-stratified dynamic transmission model to assess spatiotemporal patterns and investigate the influence of environmental variables and viral interference from RSV on HMPV transmission rates. We found that models incorporating climate variables into the transmission rate, including vapor pressure, precipitation, potential evapotranspiration, and minimum temperature, could not capture the timing of HMPV activity across all regions. Instead, HMPV timing was associated with RSV activity, with the HMPV transmission rate reduced in the presence of RSV. We showed that, unlike RSV, only models incorporating viral interference could reproduce the biennial pattern of HMPV observed in some regions, characterized by alternating late-small and early-large epidemics. Furthermore, our model successfully reproduced post-COVID-19 HMPV and RSV epidemics and predicted that RSV interventions are not likely to lead to a substantial increase in HMPV activity despite decreasing competition from RSV. Our work unravels the spatiotemporal dynamics of HMPV and its interaction with RSV, informing future seasonal forecasting and intervention strategies for HMPV.

18
Emergence and Spread of Artemisinin-Resistant Malaria in Zambia

Mwenda, M.; Oliveira, R.; Mambwe, B.; Chiyesu, C.; Bohmeier, B.; Mosler, K.; Phiri, M.; Sinyoolo, A.; Chiposa, V.; Namonje, T.; Munsanje, M.; Ilunga, M.; Chirwa, C.; Mwape, I.; Mumba, D.; Coppee, R.; Stoica, M.-A.; Veiga, M. I.; Drakeley, C.; Pearson, R.; Verity, R.; Chirwa, J.; Mockenhaupt, F. P.; van Loon, W.; Portugal, S.; Simulundu, E.; Bwalya, S.; Miller, J. M.; Chilengi, R.; Fanaka, C.; Bridges, D. J.; Hawela, M.; Hendry, J. A.

2026-06-10 infectious diseases 10.64898/2026.06.04.26354343 medRxiv
Top 11%
0.0%

Background Artemisinin derivatives are central to first-line treatment of both uncomplicated and severe Plasmodium falciparum malaria. Emerging artemisinin partial resistance in East Africa threatens to spread across the continent. Methods In two cross-sectional studies in Zambia in 2024, we genotyped the artemisinin resistance-associated gene Pfkelch13. In Kaoma, western Zambia, we evaluated the percentage of patients with day-3 parasite positivity following treatment with artemisinin-based combination therapy, and ex vivo parasite susceptibility to dihydroartemisinin (the active metabolite of artemisinin). We also assessed longitudinal changes in Pfkelch13 mutation prevalence in Kaoma using isolates collected from 2018 through 2026. Results We identified a novel mutation, Pfkelch13 A724E, in 52% (113 of 217) of isolates from Western Province, 51% (94 of 184) of isolates from North-Western Province, and 11.7% (229 of 1,949) of isolates country-wide. In Kaoma, 28% (21 of 75) of patients carrying Pfkelch13 A724E mutant parasites before treatment were parasite positive on day 3, compared with 0% (0 of 23) of patients with the wild-type allele (P=0.003). Within day-3 positive patients, the proportion of A724E mutant parasites increased significantly after treatment (P = 0.013). The prevalence of Pfkelch13 A724E in Kaoma increased steadily from 0% (95% confidence interval [CI], 0 to 22%) in 2018 to 79% (95% CI, 73 to 85%) in 2026. Conclusions A novel Pfkelch13 mutation conferring partial resistance to artemisinin is spreading in Zambia. Additional clinical evaluations are urgently needed in the region. (Funded by the Gates Foundation, INV-048316).

19
Beyond Injection Detection: A Positive-Security Prompt Firewall that Closes the Scope and PHI Gap SOTA Classifiers Miss in Healthcare

Schwoebel, J.; Semenec, I.; Rousseva, J.; Frasch, M. G.; Thorstenson, R.; Bhatt, M.

2026-06-06 health systems and quality improvement 10.64898/2026.06.04.26354950 medRxiv
Top 12%
0.0%

Large language models embedded in autonomous agents process trusted instructions and untrusted data in one context window, leaving them open to direct and indirect prompt injection. In healthcare this is not hypothetical: a 2025 JAMA Network Open study found commercial medical LLMs followed injected instructions in 94.4% of simulated patient encounters, including life-threatening recommendations. Yet the clinically decisive problem we quantify here is different. Most real clinical threats (protected health information (PHI) exfiltration, cross-patient access, bulk export, out-of-scope advice) are fluent, legitimate-looking requests that carry no attack signal, so even a state-of-the-art injection detector passes them. Existing runtime guardrails trade safety against latency: model-based auditors are accurate but add hundreds of milliseconds of Python inference, while lexical filters are fast but blind to obfuscated or semantically disguised payloads. We present QFIRE, an inline, provider-agnostic prompt firewall implemented as a single self-contained Rust toolchain (proxy, CLI, and benchmark harness). QFIRE combines three mechanisms: (i) positive-security scope constraints, which restrict a model call to a declared natural-language purpose and block out-of-scope drift even when no overt attack token is present; (ii) an asynchronous detector graph that runs N rules and their detector nodes concurrently, cheapest checks first; and (iii) a de-obfuscation pass that decodes Base64, hex, and ROT13, folds homoglyphs and leetspeak, and strips zero-width characters before detection. QFIRE ships 106 versioned firewall rules and a dedicated HIPAA Safe Harbor 18-identifier PHI panel, and runs a local DeBERTa v3 injection classifier via embedded ONNX Runtime.
On 1968 public prompt-injection and jailbreak prompts, QFIRE's deterministic hybrid attains F1 0.86, statistically tied with Meta's state-of-the-art PromptGuard 2 (0.86) and above protectai DeBERTa v3 (0.83); lexical baselines lag at 0.16 to 0.50. Our central result is on QFIRE HealthBench, a new 2000-prompt healthcare benchmark we build and release with real garak and Microsoft PyRIT payloads. There the same PromptGuard 2 recovers only 0.40 recall (DeBERTa v3: 0.57), because most clinical threats carry no injection signal; QFIRE's combined scope-plus-PHI chain reaches 0.83 recall (F1 0.87) at a calibrated 0.08 false-positive rate. Generic injection detection, even state of the art, is therefore necessary but not sufficient for healthcare agents. A bare LLM judge also closes most of this static-corpus gap (F1 0.90); QFIRE's contribution beyond static accuracy is auditable determinism, bounded latency, and adaptive robustness, where the bare judge falls to 34 to 59% recall (section 5.5). End to end, placing QFIRE in front of a tool-using agent over a mock EHR sandbox cuts the agent's harmful-action rate from 0.38 to 0.00 at a 0.13 benign-utility cost. All code, rules, corpora snapshots, and scripts are released, and every table regenerates from a single make paper target against local models with no paid API keys.

20
Adapting a Regulation of Craving Magnetic Resonance Imaging Task to Generate Functional Repetitive Transcranial Magnetic Stimulation Targets for the Ventromedial and Dorsolateral Prefrontal Cortex in Treatment-Seeking Participants with Cannabis Use Disorder

Geoly, A.; McCalley, D. M.; Struckmann, W.; Azeez, A.; Wong, B.; Kim, B.; Ninomiya, S.; Ahmed, S.; Kim, J. P.; McRae-Clark, A. L.; Froeliger, B.; Sahlem, G. L.

2026-06-06 addiction medicine 10.64898/2026.06.04.26353616 medRxiv
Top 12%
0.0%

Background: Repetitive Transcranial Magnetic Stimulation (rTMS) is a promising treatment across addictive disorders including Cannabis Use Disorder (CUD). Targeting incentive-salience circuitry via the ventromedial prefrontal cortex (vmPFC) and central-executive circuitry via the left dorsolateral prefrontal cortex (LDLPFC) are both promising treatment approaches; however, structural targets have predominated to date, whereas functional targeting may allow for more precision. In this pilot trial, we adapted a functional Magnetic Resonance Imaging (fMRI) Regulation of Craving (ROC) task to generate fMRI-based rTMS targets in the vmPFC and LDLPFC. Methods: We recruited treatment-seeking participants with moderate or severe CUD as part of an open-label trial and administered an adapted ROC task during fMRI following 24 hours of cannabis abstinence. We identified sub-portions of maximal activation of the LDLPFC when participants thought of long-term consequences of cannabis use (Later) and of the vmPFC when participants thought of short-term positive aspects of cannabis use (Now). We hypothesized that our task would generate acceptable rTMS targets in >66% of baseline fMRI scans. Results: A total of 20 participants enrolled in the trial (50%F, age=33.3±9.8) and completed the baseline fMRI. The adapted ROC task elicited group-level activation in the LDLPFC and precuneus in the Later>Now contrast and in the bilateral vmPFC, ACC, and striatum in the Now>Later contrast. Acceptable functional targets resolved in both the vmPFC and LDLPFC in 19 of 20 participants (one participant did not tolerate MRI). Conclusions: The adapted ROC task elicits activation in incentive-salience and central-executive circuitry and can feasibly generate rTMS targets when using a cluster selection algorithm.